1 Exploring flexsurv

1.1 Introduction

\(\texttt{flexsurv}\) is an R package for parametric survival analysis. The CRAN vignette provides a detailed list of the package’s functionality, while this RSS guide gives a more readable overview (and the source file provides all the code).

The main features of the packages are:

  • Fit any parametric survival distribution, given a pdf or hazard function (as well as their cumulative counterparts, preferably);
  • Built-in commonly used survival distributions;
  • Model any parameter of any distribution as a linear or log-linear function of the covariates.

While the semi-parametric Cox model is very popular, primarily because the effects of covariates can be estimated without requiring knowledge of the baseline survival distribution, there are advantages to fully-parametric approaches in some cases.

The \(\texttt{flexsurv}\) package provides a set of tools for parametric modelling. The syntax is designed to be consistent with the \(\texttt{survival}\) package (the standard package for survival analysis in R) but \(\texttt{flexsurv}\) goes beyond \(\texttt{survival}\) in that it allows for more flexible modelling, e.g. \(\texttt{survival}\) only supports two-parameter location-scale distributions.

1.2 Parametric survival models

1.2.1 General model

The general model fitted by \(\texttt{flexsurv}\) has a PDF for death at time \(t\) given by \[\begin{equation}\label{eqn:general_model} f(t|\mu(\boldsymbol{z}),\boldsymbol{\alpha}(\boldsymbol{z})), \quad t\geq 0 \end{equation}\] We define also the CDF \(F(t)\), the survival function \(S(t)=1-F(t)\), the hazard function \(h(t)=f(t)/S(t)\) and the cumulative hazard \(H(t)=-\log S(t)\). The parameter of primary interest (usually a location parameter) is \(\mu=\alpha_0\), while the parameters \(\boldsymbol{\alpha}=(\alpha_1,\ldots,\alpha_R)\) are ancillary (usually related to higher moments).

The parameters generally depend on the covariate vector \(\boldsymbol{z}\) through link-transformed linear models \[\begin{align} g_0(\mu(\boldsymbol{z})) &=\gamma_0+\boldsymbol{\beta}_0^T\boldsymbol{z} \label{eqn:link_function_1} \\ g_r(\alpha_r(\boldsymbol{z}))&=\gamma_r+\boldsymbol{\beta}_r^T\boldsymbol{z} \label{eqn:link_function_2} \end{align}\] Typically \(g(\cdot)\) will be identity for unrestricted parameters and \(\log(\cdot)\) for positive parameters.

1.2.2 Proportional hazards model

If the location parameter depends on the covariates, but the ancillary parameters don’t then the hazard function factorises as \[h(t|\mu(\boldsymbol{z}),\boldsymbol{\alpha})=h_0(t|\boldsymbol{\alpha})\mu(\boldsymbol{z}).\] This is a proportional hazards (PH) model: for any two individuals with covariates \(\boldsymbol{z}_1\), \(\boldsymbol{z}_2\), the hazard ratio \[\frac{h_1(t|\mu(\boldsymbol{z}_1),\boldsymbol{\alpha})}{h_1(t|\mu(\boldsymbol{z}_2),\boldsymbol{\alpha})}=\frac{h_0(t|\boldsymbol{\alpha})\mu(\boldsymbol{z}_1)}{h_0(t|\boldsymbol{\alpha})\mu(\boldsymbol{z}_2)}=\frac{\mu(\boldsymbol{z}_1)}{\mu(\boldsymbol{z}_2)}\] is constant over time.

1.2.3 Acclerated failure time (AFT) model

If \(S(t|\mu(\boldsymbol{z}),\boldsymbol{\alpha})=S_0(\mu(\boldsymbol{z})t|\boldsymbol{\alpha})\) then it is an accelerated failure time (AFT), where the covariate effect is multiplicative with respect to the survival time.

1.2.4 Data

Let \(\{(t_i,\delta_i,s_i)\}_{i=1}^n\) be a set of survival data, where

  • \(t_i\) denotes the event/censoring time of individual \(i\);
  • \(\delta_i\) is the censoring indicator for individual \(i\), i.e. \(\delta_i=1\) if \(t_i\) is an observed event time and \(\delta_i=0\) if \(t_i\) represents the (right) censoring time;
  • \(s_i\) is the left-truncation time for individual \(i\), i.e. the \(i\)th survival time is only observed conditionally on the individual having survived up to time \(s_i\). If there is no left-truncation, then \(s_i=0\).

1.2.5 Likelihood

The likelihood for the parameters \(\boldsymbol{\theta}=(\boldsymbol{\gamma},\boldsymbol{\beta})\) from Equation is given by \[\begin{equation}\label{eqn:lkhd} l(\boldsymbol{\theta}|\boldsymbol{t},\boldsymbol{\delta},\boldsymbol{s})=\frac{\prod_{i:\,\delta_i=1}f_i(t_i)\prod_{i:\,\delta_i=1}S_i(t_i)}{\prod_{i}S_i(s_i)}. \end{equation}\] where \[\begin{align*} f_i(t_i) &= f_i(t_i|\mu(\boldsymbol{z}_i),\boldsymbol{\alpha}(\boldsymbol{z}_i)) \\ S_i(t_i) &= S_i(t_i|\mu(\boldsymbol{z}_i),\boldsymbol{\alpha}(\boldsymbol{z}_i)) \\ \end{align*}\] for brevity and \(\mu,\boldsymbol{\alpha}\) are related to \(\gamma,\boldsymbol{\beta},\boldsymbol{z}_i\) via the link functions in Equations and . Note that if there is no left-truncation, then the denominator of Equation equals one.

The log-likelihood can be expressed in a more useful form involving the hazard and cumulative hazard functions, given by \[\begin{equation}\label{eqn:log_lkhd} \log l(\boldsymbol{\theta}|\boldsymbol{t},\boldsymbol{\delta},\boldsymbol{s})=\sum_{i:\,\delta_i=1}\log h_i(t_i) - \sum_{i}H_i(t_i) +\sum_{i} H_i(s_i). \end{equation}\] Again we suppress the conditionalities for brevity and in the case where there is no left-truncation the final term vanishes.

For these likelihoods to be valid, we assume the individual survival times are independent and the censoring times are distributed independently of the parameters \(\boldsymbol{\theta}\).

1.3 Fitting parametric survival models in \(\texttt{flexsurv}\)

1.3.1 The \(\texttt{bc}\) dataset

To demonstrate the functionality of \(\texttt{flexsurv}\), we use the in-built dataset \(\texttt{bc}\), which contains 686 patients with breast cancer.

censrec rectime group recyrs
0 1342 Good 3.676712
0 1578 Good 4.323288
0 1760 Good 4.821918
0 1152 Good 3.156164
0 967 Good 2.649315
0 629 Good 1.723288

The columns of the dataset are as follows:

  • \(\texttt{censrec}\) is the censoring indicator (1=dead, 0=censored);
  • \(\texttt{rectime}\) is the death/censoring time in days;
  • \(\texttt{group}\) is the prognostic group, either “Good”, “Medium” or “Poor”;
  • \(\texttt{recyrs}\) is the death/censoring time in years.

1.3.2 Fitting a model

The main model-fitting function is \(\texttt{flexsurvreg}\). The first argument is an R formula, of which the left-hand side is an R survival object, created using the \(\texttt{Surv}\) function from the \(\texttt{survival}\) package.

fs1 <- flexsurvreg(Surv(recyrs,censrec) ~ group, # the 'formula'
                   data = bc, # the data set
                   dist = "weibull") # specify the parametric distribution

If the \(\texttt{dist}\) argument is a string, then it uses a built in parametric distribution; otherwise we can specify a custom distribution. If we have left-truncation times as variable in our data set, we add this as the first argument in the \(\texttt{Surv}\) call.

Printing the fitted model gives useful information, including the parameter estimates and associated confidence intervals.

## Call:
## flexsurvreg(formula = Surv(recyrs, censrec) ~ group, data = bc, 
##     dist = "weibull")
## 
## Estimates: 
##              data mean  est      L95%     U95%     se       exp(est)  L95%   
## shape             NA     1.3797   1.2548   1.5170   0.0668       NA        NA
## scale             NA    11.4229   9.1818  14.2110   1.2728       NA        NA
## groupMedium   0.3338    -0.6136  -0.8623  -0.3649   0.1269   0.5414    0.4222
## groupPoor     0.3324    -1.2122  -1.4583  -0.9661   0.1256   0.2975    0.2326
##              U95%   
## shape             NA
## scale             NA
## groupMedium   0.6943
## groupPoor     0.3806
## 
## N = 686,  Events: 299,  Censored: 387
## Total time at risk: 2113.425
## Log-likelihood = -811.9419, df = 4
## AIC = 1631.884

To know what parametrisation has been used, consult the documentation (they can differ from those used by \(\texttt{survreg}\) from the \(\texttt{survival}\) package).

\(\texttt{flexsurvreg}\) has a range of built-in distributions. For distributions whose density functions are provided in the standard R installation, \(\texttt{flexsurvreg}\) will use these functions (\(\texttt{dexp}\), \(\texttt{dweibull}\), etc.) in internal calculations, and will use the same parametrisation. \(\texttt{flexsurv}\) provides a range of additional survival distributions; consult the documentation for details.

1.3.3 Covariates on ancillary parameters

We now fit two generalised gamma models (a three-parameter family of distributions which generalises the Weibull, gamma, and log-normal) to the \(\texttt{bc}\) dataset.

The first model, \(\texttt{fs2}\), is an AFT model where only the parameter \(\mu\) depends on the prognostic covariate \(\texttt{group}\). In the second model, \(\texttt{fs3}\), the first ancillary parameter \(\texttt{sigma}\) also depends on the covariate (the resulting model is neither PH nor AFT). The second ancillary parameter \(\texttt{Q}\) is common for all prognostic groups.

fs2 <- flexsurvreg(Surv(recyrs,censrec)~group, data = bc, dist = "gengamma")
fs3 <- flexsurvreg(Surv(recyrs,censrec)~group+sigma(group), data = bc, dist = "gengamma")

1.3.4 Plotting outputs

The \(\texttt{plot()}\) method can be applied to \(\texttt{flexsurvreg}\) objects. This will draw the Kaplan-Meier estimate of the survival function (one for each combination of covariates, or just a single population average curve if there are no categorical covariates), and overlays the corresponding estimates from the fitted models. Further model fits can be added using the \(\texttt{lines()}\) method (and this is the approach that should be taken for publication level plots, the \(\texttt{plot()}\) method is only intended for model fit checking purposes). By changing the \(\texttt{type}\) argument we can plot the hazard or cumulative hazard function instead.

1.3.5 Model summaries

Any function \(\texttt{fn}\) of the parameters of a fitted model can be summarised by passing \(\texttt{fn}\) into the \(\texttt{summary()}\) method. Confidence intervals are obtained by bootstrap sampling, the argument \(\texttt{B}\) can be adjusted to obtain the desired level of precision.

1.3.6 Computation

\(\texttt{flexsurvreg}\) maximises the likelihood using the R function \(\texttt{optim}\). The derivatives for standard distributions are built in. For custom distributions, the user can optionally supply functions to calculate the log density and log survival functions (with respect to the transformed baseline parameters \(\boldsymbol{\gamma}\)).

1.3.7 Custom survival distributions

\(\texttt{flexsurv}\) can work with any survival model of the form (), if either the density or hazard function is provided. To supply a custom distribution to \(\texttt{flexsurvreg}\), the \(\texttt{dist}\) argument should be a \(\texttt{list}\) object containing the name of the distribution (\(\texttt{name}\)) and at least one of the density function (\(\texttt{d<name>}\)) or the hazard function (\(\texttt{h<name>}\)). If analytic forms for the cumulative distribution or cumulative hazard are available, these should be supplied too, as (\(\texttt{p<name>}\) and \(\texttt{H<name>}\) respectively (if not, \(\texttt{flexsurv}\) will compute them internally using numerical integration, which makes model fitting take longer and potentially become unstable).

All the functions supplied must be vectorised and accept an argument \(\texttt{log}\), which when \(\texttt{TRUE}\), returns the log density/hazard. If the desired functions are available in another contributed R package, they can be used, but a wrapper may be needed to convert them into an acceptable form for \(\texttt{flexsurvreg}\).

2 Implementing APGW in \(\texttt{flexsurv}\)

2.1 Adapted power generalised Weibull (APGW)

APGW is a four-parameter distribution proposed by Burke et al. (2019). They suggest modelling the CH function by \[H(t)=\lambda H_0((\phi t)^\gamma;\kappa)\]
where

  • \(H_0(\cdot;\kappa)\) is an underlying CH function with shape parameter \(\kappa>-1\);
  • \(\phi>0\) is the AFT parameter, controlling the horizontal scaling of the hazard;
  • \(\lambda>0\) is the PH parameter, controlling the vertical scaling of the hazard;
  • \(\gamma>0\) is a shape parameter, explicitly defined as a power parameter (unlike \(\kappa\) which can be incorporated in more flexible and complicated ways).

In particular, by fixing \(\lambda=1\) or \(\phi=1\) we obtain two important sub-models: \[\underbrace{H(t)=H_0((\phi t)^\gamma;\kappa)}_{\text{AFT model ($\phi$-regression)}}\qquad \underbrace{H(t)=\lambda H_0(t^\gamma;\kappa)}_{\text{PH model ($\lambda$-regression)}}\]

A specific choice for \(H_0(t^\gamma;\kappa)\) is proposed, namely \[H_A(t;\gamma,\kappa)=\frac{\kappa+1}{\kappa}\left\lbrace\left(1+\frac{t^\gamma}{\kappa+1}\right)^\kappa -1 \right\rbrace\] with corresponding hazard function \[h_A(t;\gamma,\kappa)=\gamma t^{\gamma -1}\left(1+\frac{t^\gamma}{\kappa+1}\right)^{\kappa-1}.\]

The APGW encompasses a broad range of popular survival distributions (e.g. Weibull, log-logistic, Gompertz) and has a more mathematically/computationally tractable form than other generalised distributions (e.g. generalised gamma).

In the most general case, we can consider the APGW multiparameter regression model given by \[\begin{align*} \log(\phi)&=x^T\tau \\ \log(\lambda)&=x^T\beta \\ \log(\gamma)&=x^T\alpha \\ \log(\kappa+1)&=x^T\nu \end{align*}\] where the log-link functions are used to ensure strict positivity of \(\phi\), \(\lambda\), \(\gamma\) and \(\kappa+1\). Here \(x=(1,x_1,\ldots,x_p)^T\) is a vector of covariates and \(\tau=(\tau_0,\tau_1,\ldots,\tau_p)^T\), \(\beta=(\beta_0,\beta_1,\ldots,\beta_p)^T\), \(\alpha=(\alpha_0,\alpha_1,\ldots,\alpha_p)^T\), \(\nu=(\nu_0,\nu_1,\ldots,\nu_p)^T\) are corresponding vectors of regression coefficients. In practice, some of the parameters may only depend on selected covariates, in which case simply set the appropriate coefficients equal to zero.

The general recommendation is as follows:

  • Only one scale parameter (\(\phi\) or \(\lambda\)) depends on the covariates, while the other is fixed equal to 1;
  • The power parameter \(\gamma\) may depend on covariates;
  • The \(\kappa\) parameter is constant, i.e. \(\nu_0\) is the only non-zero component of \(\nu\).

2.2 Implementing APGW as a custom distribution in \(\texttt{flexsurv}\)

To define a custom distribution we need the following:

  • \(\texttt{name}\): the name of the distribution, a string;
  • Either the density function \(\texttt{d<name>}\) or the hazard function \(\texttt{h<name>}\), and, if available analytically, the corresponding cumulative function (i.e. the distribution function \(\texttt{p<name>}\) or the cumulative hazard function \(\texttt{H<name>}\)). All these functions must be vectorised;
  • \(\texttt{pars}\): character vector naming the parameters, matching the arguments of the functions and in the same order;
  • \(\texttt{location}\): name of the location parameter (in the \(\texttt{flexsurv}\) sense), a character;
  • \(\texttt{transforms}\): list of functions transforming the parameters to the real line;
  • \(\texttt{inv.transforms}\): list of the corresponding inverse functions;
  • \(\texttt{inits}\): a function specifying plausible initial values for the parameters in terms of the data.
# hazard function
hapgw <- function(x, phi, lambda, gamma, kappa, log=FALSE){
  lambda*gamma*phi^gamma*(x)^{gamma-1}*(1+((phi*x)^gamma)/(kappa+1))^(kappa-1)
}

# cumulative hazard function
Hapgw <- function(x, phi, lambda, gamma, kappa, log=FALSE){
  lambda*(kappa+1)/kappa * ((1 + (phi*x)^gamma/(kappa+1))^kappa - 1)
}
  
# function to transform kappa to real line (and the inverse)
kappa_transform <- function(x){log(x+1)}
kappa_inv_transform <- function(x){exp(x)-1}

custom.apgw <- list(name="apgw",
                    pars=c("phi","lambda","gamma","kappa"),
                    location="lambda",
                    transforms=c(log,log,log,kappa_transform),
                    inv.transforms=c(exp,exp,exp,kappa_inv_transform),
                    inits=function(t)c(1,1,1,1)
                    )

The initial parameter values are set as 1. This is because \(\kappa=\gamma=1\) corresponds to the exponential distribution (a reasonable choice to begin with), and sets the default value for \(\lambda\) and \(\phi\) to one, meaning we can restrict the model to the PH or AFT sub-models simply by passing \(\texttt{fixedpars}=1\) or \(\texttt{fixedpars}=2\) respectively when we call \(\texttt{flexsurvreg}\).

Now fit the APGW model to the \(\texttt{bc}\) dataset. Here we use the AFT model, i.e. fix \(\lambda=1\) and allow \(\phi\) to depend on the covariates. The power parameter \(\gamma\) also depends on the covariate, but \(\kappa\) is a constant.

fs_apgw_aft <- flexsurvreg(Surv(recyrs,censrec) ~ group + gamma(group),
                   fixedpars=2, # fix lambda=1 (1 is init value)
                   data = bc, 
                   dist = custom.apgw) # specify my custom dist

2.3 Checking the APGW model fit

To test the APGW model, we fit some alternative models for comparison:

  • Weibull (AFT) - only the scale parameter depends on the covariates;
  • Generalised gamma (AFT) - only the parameter \(\mu\) depends on the covariates.
fs_weibull_aft <- flexsurvreg(Surv(recyrs,censrec) ~ group, data=bc, dist="weibull")
fs_gg_aft <- flexsurvreg(Surv(recyrs,censrec) ~ group, data=bc, dist="gengamma")

The plots below show the various model fits.

Comments about the survival functions:

  • The APGW and generalised gamma models give very similar fits, and fit the data (i.e. the non-parameter Kaplan-Meier estimate for the survival function) well.
  • The 95% CI for the APGW model appear to be accurate for ech group.
  • The performance of the Weibull model also appears to be OK, although the shape parameter for the ‘Poor’ group appears to be off, overestimating survival at early times, and underestimating later on (we see from the hazard plots that actually the issue is that the shape of the hazard function for the ‘Poor’ group is one that the Weibull model is unable to capture).

Comments about the hazard functions:

  • Unlike the Weibull model, the APGW and generalised gamma models are capable of modelling a greater variety of hazard shapes. This advantage is utilised for the ‘Poor’ and ‘Good’ groups, where the hazard function has a \(\cap\) shape.

3 Simulation studies: PH and PO models

3.1 Introduction

In this section, we run simulation studies to investigate proportional hazards (PH) and proportional odds (PO) models. We want to discover qualitative features in the model fit that hint at an incorrect choice of assumption (i.e. we have fit a PH model when the data is PO, or vice versa). To do this, we simulate PH and PO datasets (each with APGW baseline), fit the two types of models to each dataset, and study the qualitative features of the survival/hazard/cumulative hazard function plots.

3.2 Simulating APGW data

The experiments will involve simulating PH and PO data with APGW baselines. We can simulate APGW variates using inversion sampling. Recalling the cumulative hazard for the APGW \[H(t)=\lambda\frac{\kappa+1}{\kappa}\left[\left(1+\frac{(\phi t)^\gamma}{\kappa+1}\right)^\kappa -1\right]\] then the CDF is given by \[F(t)=1-\exp(-H(t))=1-\exp\left\lbrace -\lambda\frac{\kappa+1}{\kappa}\left[\left(\frac{1+(\phi t)^\gamma}{\kappa+1}\right)^\kappa -1\right]\right\rbrace.\] and we find that the inverse CDF is given by \[F^{-1}(t)=\frac{1}{\phi}\left\lbrace(\kappa+1)\left[\left(\frac{-\kappa}{\lambda(\kappa+1)}\log(1-t)+1\right)^{\frac{1}{\kappa}}-1\right]\right\rbrace^{\frac{1}{\gamma}}.\]

The inversion sampling algorithm for generating \(t\sim\mathrm{APGW}\) is as follows:

  1. Generate \(u\) from \(U\sim\mathrm{U}(0,1)\).
  2. Set \(t=F^{-1}(1-u)\).

(Note: we typically set \(t=F^{-1}(u)\) but using \(1-u\) is slightly cheaper computationally and gives the same result, since \(1-u\) is also from \(U(0,1)\).)

Now we write an R function to simulate \(n\) APGW variates, using exactly the same syntax as the analogous functions \(\texttt{rexp}\), \(\texttt{runif}\) etc.

rapgw <- function(n, phi, lambda, gamma, kappa){
  u <- runif(n=n, min=0, max=1)
  x <- (1/phi) * ((kappa+1)*((-kappa/(lambda*(kappa+1))*log(u)+1)^(1/kappa)-1))^(1/gamma)
  return(x)
}

3.3 Simulating PH data with APGW baseline

Under the PH assumption, the hazard function for group \(i=1,\ldots,k\) is given by \[h_i(t)=h_0(t)\theta_i,\] where \(h_0(t)\) is the baseline hazard function and \(\theta_i>0\) is the hazard ratio between group \(i\) and the baseline.

The following function simulates \(n\) variates from a distribution that has odds ratio \(\theta\) relative to an \(\mathrm{APGW}(\phi,\lambda,\gamma,\kappa)\) baseline.

rapgwph <- function(n, theta, phi, lambda, gamma, kappa){
  rapgw(n, phi, theta*lambda, gamma, kappa)
}

In these experiments, we simulate data for \(k+1\) groups. We fix one group to be the ‘baseline’ group and the survival functions for the other groups are restricted by the PH assumption.

sim_apgwph_df <- function(k, N, theta, phi, lambda, gamma, kappa, t_max){
  
  group <- rep(seq(0,k), each=N)
  theta = ifelse(group==0, 1, theta[group])
  
  # simulate event times
  t <- mapply(rapgwph, theta, phi, lambda, gamma, kappa, MoreArgs=list(n=1))
  
  # administrative censoring at t_max
  delta <- 1*(t<t_max)
  t <- pmin(t, t_max)
  
  group <- as.factor(group)
  df <- data.frame(group, theta, phi, lambda, gamma, kappa, t, delta)
  
  return(df)
}

Now we simulate a PH dataset.

apgwph_df <- sim_apgwph_df(k=2, N=350, theta=c(0.5,1.5), phi=2, lambda=0.9, gamma=1.2, kappa=0.9, t_max=1.5)

The head and tail of the dataset is shown below:

group theta phi lambda gamma kappa t delta
0 1 2 0.9 1.2 0.9 0.2481980 1
0 1 2 0.9 1.2 0.9 0.0378416 1
0 1 2 0.9 1.2 0.9 0.2497489 1
0 1 2 0.9 1.2 0.9 0.6283964 1
0 1 2 0.9 1.2 0.9 0.9384928 1
group theta phi lambda gamma kappa t delta
1046 2 0.5 2 0.9 1.2 0.9 0.1731436 1
1047 2 0.5 2 0.9 1.2 0.9 0.5407074 1
1048 2 0.5 2 0.9 1.2 0.9 0.6911447 1
1049 2 0.5 2 0.9 1.2 0.9 0.7754839 1
1050 2 0.5 2 0.9 1.2 0.9 0.4900885 1

We can plot the non-parametric estimates of the cumulative hazards to check if they look proportional (although if the sample size is small it may be hard to tell with any great confidence):

3.4 Simulating PO data with APGW baseline

The odds of an event are defined as the ratio of the probability of the event occurring and the probability of the event not occurring, then by analogy a PO model is given by \[\frac{1-S(t;\boldsymbol{z})}{S(t;\boldsymbol{z})}=\frac{1-S_0(t)}{S_0(t)}\exp(\boldsymbol{\beta}^T\boldsymbol{z})\] where \(S_0(t)=S(t;\boldsymbol{0})\) is the baseline survival function, and \(\frac{1-S(t;\boldsymbol{z})}{S(t;\boldsymbol{z})}\) represents the odds of the event occurring in \((0,t)\) for an individual with covariate vector \(\boldsymbol{z}\).

So for our grouped data, we have \[\frac{1-S_i(t;\boldsymbol{z})}{S_i(t;\boldsymbol{z})}=\frac{1-S_0(t)}{S_0(t)}\theta_i,\qquad i=1,\ldots,k\] where \(S_0(t)\) is the baseline survival function, and \(\theta_i>0\) is the odds ratio between group \(i\) and the baseline group. This relation can be rewritten in terms of cumulative hazards: \[H_i(t) = \log\left( 1+ \theta_i\left[\exp(H_0(t))-1\right]\right).\]

We have \[H_0(t)=\lambda\frac{\kappa+1}{\kappa}\left[\left(1+\frac{(\phi t)^\gamma}{\kappa+1}\right)^\kappa -1\right]\] so then the CDF for group \(i\) is given by \[F_i(t)=1-\exp(-H_i(t))=1-\exp\left(-\log\left( 1+ \theta_i\left[\exp(H_0(t))-1\right]\right)\right)\]

and we obtain the inverse CDF \[F_i^{-1}(t)=\frac{1}{\phi}\left\lbrace(\kappa+1)\left[\left(\frac{-\kappa}{\lambda(\kappa+1)}\log\left(\frac{\theta_i(1-t)}{\theta_i+t(1-\theta_i)}\right)+1\right)^{\frac{1}{\kappa}}-1\right]\right\rbrace^{\frac{1}{\gamma}}.\]

We can verify that if \(\theta_i=1\) then this reduces to the inverse CDF for the APGW (i.e. the expression inside the \(\log()\) becomes \(1-t\)).

The inversion sampling algorithm for generating an event time for group \(i\) is as follows:

  1. Generate \(u\) from \(U\sim\mathrm{U}(0,1)\).
  2. Set \(t=F_i^{-1}(u)\).
rapgwpo <- function(n, theta, phi, lambda, gamma, kappa){
  u <- runif(n=n, min=0, max=1)
  x <- (1/phi) * ((kappa+1)*((-kappa/(lambda*(kappa+1))*log((theta*(1-u))/(theta+u*(1-theta)))+1)^(1/kappa)-1))^(1/gamma)
  return(x)
}

We can now simulate the data in exactly the same way as for the PH simulations.

sim_apgwpo_df <- function(k, N, theta, phi, lambda, gamma, kappa, t_max){
  
  group <- rep(seq(0,k), each=N)
  theta = ifelse(group==0, 1, theta[group])
  
  # simulate event times
  t <- mapply(rapgwpo, theta, phi, lambda, gamma, kappa, MoreArgs=list(n=1))
  
  # administrative censoring at t_max
  delta <- 1*(t<t_max)
  t <- pmin(t, t_max)
  
  group <- as.factor(group)
  df <- data.frame(group, theta, phi, lambda, gamma, kappa, t, delta)
  
  return(df)
}

Now we simulate a PO dataset (same size and number of groups as the PH dataset).

apgwpo_df <- sim_apgwpo_df(k=2, N=350, theta=c(0.5,1.5), phi=2, lambda=0.9, gamma=1.2, kappa=0.9, t_max=1.5)

The head and tail of the dataset is shown below (it is essntially the same as the PH dataset, except now \(\theta\) represents an odds ratio rather than a hazards ratio).

group theta phi lambda gamma kappa t delta
0 1 2 0.9 1.2 0.9 0.1930210 1
0 1 2 0.9 1.2 0.9 1.0122512 1
0 1 2 0.9 1.2 0.9 0.8161682 1
0 1 2 0.9 1.2 0.9 0.4215797 1
0 1 2 0.9 1.2 0.9 0.5169450 1
group theta phi lambda gamma kappa t delta
1046 2 0.5 2 0.9 1.2 0.9 0.5404060 1
1047 2 0.5 2 0.9 1.2 0.9 0.9803349 1
1048 2 0.5 2 0.9 1.2 0.9 0.6570405 1
1049 2 0.5 2 0.9 1.2 0.9 0.3615188 1
1050 2 0.5 2 0.9 1.2 0.9 0.7852030 1

Now we plot the odds to check proportionality. This is not straightforward to do directly using \(\text{flexsurv}\), so we use \(\text{survfit}\) to fit and plot the non-parametric estimates of the odds functions. Later we will overlay the odds functions for fitted models using the \(\texttt{fn}\) argument of the plot method in \(\text{flexsurv}\).

3.5 Fitting PH model with APGW baseline

Specify a custom distribution which fits a PH model where baseline (group 0) is APGW. Note: at the moment it doesnt set theta=1 for group 0, so none of the groups need be APGW. We want group 0 to be APGW and then all the other groups to be fixed by PH assumption. At the moment, there is some fictional group that is APGW, and the groups are PH with respect to that group, which isn’t quite the same thing.

# hazard function
hapgwph <- function(x, theta, phi, lambda, gamma, kappa, log=FALSE){
  theta * hapgw(x, phi, lambda, gamma, kappa, log=FALSE)
}

# cumulative hazard function
Hapgwph <- function(x, theta, phi, lambda, gamma, kappa, log=FALSE){
  theta * Hapgw(x, phi, lambda, gamma, kappa, log=FALSE)
}

custom.apgwph <- list(
  name="apgwph",
  pars=c("theta","phi","lambda","gamma","kappa"),
  location="theta",
  transforms=c(log,log,log,log,kappa_transform),
  inv.transforms=c(exp,exp,exp,exp,kappa_inv_transform),
  inits=function(t)c(1,1,1,1,1)
)

3.6 Fitting PO model with APGW baseline

Specify a custom distribution which fits a PO model where baseline (group 0) is APGW. Note: at the moment it doesnt set theta=1 for group 0, so none of the groups need be APGW. We want group 0 to be APGW and then all the other groups to be fixed by PO assumption. At the moment, there is some fictional group that is APGW, and the groups are PO with respect to that group, which isn’t quite the same thing.

# hazard function
hapgwpo <- function(x, theta, phi, lambda, gamma, kappa, log=FALSE){
  num = theta * hapgw(x,phi,lambda,gamma,kappa)
  den = exp(-Hapgw(x,phi,lambda,gamma,kappa))+theta*(1-exp(-Hapgw(x,phi,lambda,gamma,kappa)))
  return(num/den)
}

# cumulative hazard function
Hapgwpo <- function(x, theta, phi, lambda, gamma, kappa, log=FALSE){
  log(1+(exp(Hapgw(x,phi,lambda,gamma,kappa))-1)*theta)
}

custom.apgwpo <- list(
  name="apgwpo",
  pars=c("theta","phi","lambda","gamma","kappa"),
  location="theta",
  transforms=c(log,log,log,log,kappa_transform),
  inv.transforms=c(exp,exp,exp,exp,kappa_inv_transform),
  inits=function(t)c(1,1,1,1,1)
)

3.7 Fit PH and PO models to the PH data

# fit PO model to PH data
fs_apgwpo_phdf <- flexsurvreg(Surv(t,delta) ~ group,
                   data = apgwph_df, 
                   dist = custom.apgwpo) 
# fit PH model to PH data
fs_apgwph_phdf <- flexsurvreg(Surv(t,delta) ~ group,
                   data = apgwph_df, 
                   dist = custom.apgwph) 

Now plot the survival functions to check general model fit and the hazards to check model fit and hazard (non) proportionality.

3.8 Fit PH and PO models to the PO data

# fit PO model to PO data
fs_apgwpo_podf <- flexsurvreg(Surv(t,delta) ~ group,
                   data = apgwpo_df, 
                   dist = custom.apgwpo) 
# fit PH model to PO data
fs_apgwph_podf <- flexsurvreg(Surv(t,delta) ~ group,
                   data = apgwpo_df, 
                   dist = custom.apgwph) 

Now plot the survival functions to check general model fit and the odds functions to check model fit and odds (non) proportionality.

We have already made some code that plots the non-parametric estimate of the odds function (transform of the KM estimate for the survival function). To plot the odds functions for the fitted model, we need to make use of the \(\texttt{fn}\) argument of \(\texttt{plot.flexsurvreg}\) (see documentation for how \(\texttt{fn}\) needs to be formatted). Since we already have functions that evaluate the cumulative hazard functions for these models, the most convenient form is to use \[O(t)=\frac{1-S(t)}{S(t)}=\frac{1-\exp(-H(t))}{\exp(-H(t))}=\exp(H(t))-1.\]

odds_apgwph <- function(t, start, theta, phi, lambda, gamma, kappa){
  exp(Hapgwph(t,theta,phi,lambda,gamma,kappa))-1
}

odds_apgwpo <- function(t, start, theta, phi, lambda, gamma, kappa){
  exp(Hapgwpo(t,theta,phi,lambda,gamma,kappa))-1
}

4 Experiments: Fibres Strength Data

The following data is from Table 4.1 in Statistical Analysis of Reliability Data by Crowder et al.

It shows the failure stresses (in GPa) of single carbon fibres of four different lengths (1/10/20/50mm). There is no censoring in this dataset, and while the outcome does not represent a ‘time’, it is non-negative and so can be modelled as such.

The head the dataset is shown below.

length fail_stress
1 2.247
1 2.640
1 2.842
1 2.908
1 3.099

We a censoring indicator (all equal to 1) so that we can create survival objects needed to apply the methods used later.

length fail_stress delta
1 2.247 1
1 2.640 1
1 2.842 1
1 2.908 1
1 3.099 1

Make some exploratory plots. First, plot the survival curves (this may help indicate suitable assumptions about the hazards, e.g. proportional hazards becomes a power law in the survival functions). Second, plot the hazards and the odds, so we can see if PH or PO assumptions look reasonable.

## Warning in muhaz(times = c(`1` = 2.247, `2` = 2.64, `3` = 2.842, `4` = 2.908, : maximum time > maximum Survival Time
## Warning in muhaz(times = c(`1` = 2.247, `2` = 2.64, `3` = 2.842, `4` = 2.908, : maximum time > maximum Survival Time
## Warning in muhaz(times = c(`1` = 2.247, `2` = 2.64, `3` = 2.842, `4` = 2.908, : maximum time > maximum Survival Time

If we accept that the survival curves appear to be location-shifts of one another, this would indicate AFT is appropriate.

A better way to test the PH and PO assumptions is as follows (from p. 81 and p. 84)

Here is the plot of log-log survivor against \(\log t\):

These curves look relatively parallel, so PH might be appropriate (note: they are not straight lines, which rules out Weibull).

The book suggests that the 1mm and 10mm curves appear to be respectively concave and convex, so PO is probably not valid.

4.1 Fitting the APGW PH and PO models

We now fit the PH/PO with APGW baseline models to the fibres data and see what happens.

# fit PO model
fs_apgwpo_fdf <- flexsurvreg(Surv(fail_stress, delta) ~ length, 
                              data = fibres_df,
                              dist = custom.apgwpo) 
# fit PH model
fs_apgwph_fdf <- flexsurvreg(Surv(fail_stress, delta) ~ length, 
                              data = fibres_df,
                              dist = custom.apgwph) 

Now plot the model fit:

## Warning in muhaz(times = c(`1` = 2.247, `2` = 2.64, `3` = 2.842, `4` = 2.908, : maximum time > maximum Survival Time
## Warning in muhaz(times = c(`1` = 2.247, `2` = 2.64, `3` = 2.842, `4` = 2.908, : maximum time > maximum Survival Time
## Warning in muhaz(times = c(`1` = 2.247, `2` = 2.64, `3` = 2.842, `4` = 2.908, : maximum time > maximum Survival Time